Data Visualization - ggplot2

Making graphs in R (I)

Today’s agenda

  • A theoretical introduction to ggplot2
  • Producing our first graph in R!
  • The main elements of coding a graph
  • Customizing our graph

About ggplot2

  • The ggplot2 package (built on the grammar of graphics)
    • Created by Hadley Wickham
    • One of the most widely used options for data viz in R
    • Layer-based approach: layers are added with +
    • Let’s dive in!

The three main elements of a graph

  1. The data
  2. The aesthetics
  3. The geometry

  • Each of these elements is added as a layer in our graph
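
To make the layer idea concrete, here is a minimal sketch of a plot built one step at a time. The `brazil_sample` data frame is a hypothetical stand-in that copies the first five rows of the Brazil table used in this lecture, so only ggplot2 needs to be installed.

```r
library(ggplot2)

# Hypothetical stand-in: first five rows of the Brazil data from this lecture
brazil_sample <- data.frame(
  year    = c(1952, 1957, 1962, 1967, 1972),
  lifeExp = c(50.917, 53.285, 55.665, 57.632, 59.504)
)

# Element 1: the data (an empty canvas)
p <- ggplot(brazil_sample)

# Element 2: the aesthetics (axes appear, but nothing is drawn yet)
p <- p + aes(x = year, y = lifeExp)

# Element 3: the geometry (points are finally drawn)
p <- p + geom_point()

# Printing the object renders the plot
p
```

Because a plot is an ordinary R object, it can be stored in a variable and extended later with `+`.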

1st layer: data

  • Data: a table that contains only the information we want to see displayed
## Data we will be using to plot!
library(dplyr)      # provides filter()
library(gapminder)  # provides the gapminder dataset
data(gapminder)

## But let's work with a smaller version of it!
gapminder_brazil <- gapminder |>
  filter(country == 'Brazil')

head(gapminder_brazil, 5)

country  continent  year  lifeExp  pop          gdpPercap
Brazil   Americas   1952  50.917    56,602,560  2,108.944
Brazil   Americas   1957  53.285    65,551,171  2,487.366
Brazil   Americas   1962  55.665    76,039,390  3,336.586
Brazil   Americas   1967  57.632    88,049,823  3,429.864
Brazil   Americas   1972  59.504   100,840,058  4,985.711

1st layer: data

  • Passing the data to the ggplot() function so it can plot it!
## Plotting the first layer of our plot (just an empty canvas for now)
library(ggplot2)

gapminder_brazil |>
  ggplot()

2nd layer: aesthetics

  • Mapping visual aspects of the graph (position, colour, size) to columns of our data
  • A way of linking what we see to our data!
## Plotting the second layer of our plot
gapminder_brazil |> 
  ggplot(aes(x = year, y = lifeExp))
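
A short sketch of what "mapping" means in practice: an aesthetic placed inside `aes()` follows a column of the data, while a value given outside `aes()` is a fixed setting. The sketch borrows `geom_point()` from the next section so there is something to colour, and `brazil_sample` is a hypothetical stand-in copying the first five rows of the table above.

```r
library(ggplot2)

# Hypothetical stand-in: first five rows of the Brazil data from this lecture
brazil_sample <- data.frame(
  year    = c(1952, 1957, 1962, 1967, 1972),
  lifeExp = c(50.917, 53.285, 55.665, 57.632, 59.504)
)

# Mapped: colour follows the lifeExp column, so it goes inside aes()
p_mapped <- ggplot(brazil_sample,
                   aes(x = year, y = lifeExp, colour = lifeExp)) +
  geom_point()

# Set: every point gets the same fixed colour, so it goes outside aes()
p_fixed <- ggplot(brazil_sample, aes(x = year, y = lifeExp)) +
  geom_point(colour = "steelblue")
```

The mapped version also produces a legend, since the colour now carries information; the fixed version does not.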

3rd layer: geometry

  • The geometric shape used to draw the information
  • Lines, points, bars, etc.
## Plotting the third layer of our plot
gapminder_brazil |> 
  ggplot(aes(x = year, y = lifeExp)) +
  geom_point()
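
Since the geometry is its own layer, the shape can be swapped without touching the data or the aesthetics. A minimal sketch, again assuming the hypothetical `brazil_sample` stand-in (first five rows of the table above):

```r
library(ggplot2)

# Hypothetical stand-in: first five rows of the Brazil data from this lecture
brazil_sample <- data.frame(
  year    = c(1952, 1957, 1962, 1967, 1972),
  lifeExp = c(50.917, 53.285, 55.665, 57.632, 59.504)
)

# Data and aesthetics are shared by all three plots
p_base <- ggplot(brazil_sample, aes(x = year, y = lifeExp))

p_points <- p_base + geom_point()  # points
p_line   <- p_base + geom_line()   # a line connecting the observations
p_bars   <- p_base + geom_col()    # bars whose height is the y value
```

Printing `p_points`, `p_line`, or `p_bars` shows the same information drawn with three different shapes.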

4th layer: labs

  • labs() covers most of the text in the plot: title, subtitle, axis labels, and caption
  • PS: from this layer on, nothing is strictly ‘necessary’!
## Plotting the fourth layer of our plot
gapminder_brazil |> 
  ggplot(aes(x = year, y = lifeExp)) +
  geom_point() +
  labs(title = 'Life expectancy evolution in Brazil',
       subtitle = 'From 1952 to 2007',
       x = 'Years',
       y = 'Life expectancy (years)',
       caption = 'Source: gapminder dataset')

5th layer: theme

  • The background theme of the plot
## Plotting the fifth layer of our plot
gapminder_brazil |> 
  ggplot(aes(x = year, y = lifeExp)) +
  geom_point() +
  labs(title = 'Life expectancy evolution in Brazil',
       subtitle = 'From 1952 to 2007',
       x = 'Years',
       y = 'Life expectancy (years)',
       caption = 'Source: gapminder dataset') +
  theme_light()
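
Themes can be swapped or fine-tuned in the same layered way. The sketch below is one possible variation, not the lecture's own choice: it tries `theme_minimal()` as an alternative and then adjusts a single element with `theme()`. The `brazil_sample` data frame is a hypothetical stand-in copying the first five rows of the Brazil table.

```r
library(ggplot2)

# Hypothetical stand-in: first five rows of the Brazil data from this lecture
brazil_sample <- data.frame(
  year    = c(1952, 1957, 1962, 1967, 1972),
  lifeExp = c(50.917, 53.285, 55.665, 57.632, 59.504)
)

p <- ggplot(brazil_sample, aes(x = year, y = lifeExp)) +
  geom_point() +
  labs(title = 'Life expectancy evolution in Brazil')

# Swapping the complete theme
p_minimal <- p + theme_minimal()

# Tweaking one element: theme() comes after theme_light(),
# because a complete theme would overwrite earlier tweaks
p_custom <- p +
  theme_light() +
  theme(plot.title = element_text(face = "bold"))
```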

More layers: scales & geom

  • Scales control how an axis is drawn (here, the breaks on the x axis)
  • A second geometry can be drawn on top of the first
## Plotting more layers for our plot!
gapminder_brazil |> 
  ggplot(aes(x = year, y = lifeExp)) +
  geom_point() +
  labs(title = 'Life expectancy evolution in Brazil',
       subtitle = 'From 1952 to 2007',
       x = 'Years',
       y = 'Life expectancy (years)',
       caption = 'Source: gapminder dataset') +
  theme_light() +
  scale_x_continuous(breaks = seq(1952, 2007, by = 5)) +
  geom_line()
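
It can help to check what the `breaks` argument actually contains. The sketch below evaluates the same `seq()` call on its own and then uses it in a small plot; `brazil_sample` is a hypothetical stand-in copying the first five rows of the Brazil table, so the plot covers only 1952 to 1972.

```r
library(ggplot2)

# The break positions used above: every 5 years from 1952 to 2007
year_breaks <- seq(1952, 2007, by = 5)
length(year_breaks)  # 12 breaks
range(year_breaks)   # 1952 2007

# Hypothetical stand-in: first five rows of the Brazil data from this lecture
brazil_sample <- data.frame(
  year    = c(1952, 1957, 1962, 1967, 1972),
  lifeExp = c(50.917, 53.285, 55.665, 57.632, 59.504)
)

# Layers are drawn in the order they are added:
# geom_line() comes first here, so the points sit on top of the line
p <- ggplot(brazil_sample, aes(x = year, y = lifeExp)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = year_breaks)
```

The breaks line up with the survey years in the data, which is why each point gets its own tick on the x axis.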

ggplot2: before & after 9 lines!

Lecture recap

  • ggplot2 builds graphs layer by layer
  • The three main components of a graph are:
    • Data
    • Aesthetics
    • Geometries
  • More customization = more layers = more lines of code

Practice exercises!